AITopics | exponential convergence rate

Bayesian Optimization with Exponential Convergence

Kenji Kawaguchi, Leslie Pack Kaelbling, Tomás Lozano-Pérez

Neural Information Processing Systems, Oct-2-2025, 00:24:03 GMT

This paper presents a Bayesian optimization method with exponential convergence without the need for auxiliary optimization and without the δ-cover sampling. Most Bayesian optimization methods require auxiliary optimization: an additional non-convex global optimization problem, which can be time-consuming and hard to implement in practice. Also, the existing Bayesian optimization method with exponential convergence [1] requires access to the δ-cover sampling, which was considered to be impractical [1, 2]. Our approach eliminates both requirements and achieves an exponential convergence rate.

algorithm, optimization, procedure, (13 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.15)

Genre: Research Report (0.68)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

Convergence Rate in Nonlinear Two-Time-Scale Stochastic Approximation with State (Time)-Dependence

Chen, Zixi, Xu, Yumin, Zhang, Ruixun

arXiv.org Artificial Intelligence, Sep-16-2025

The nonlinear two-time-scale stochastic approximation is widely studied under conditions of bounded variances in noise. Motivated by recent advances that allow for variability linked to the current state or time, we consider state- and time-dependent noises. We show that the Lyapunov function exhibits polynomial convergence rates in both cases, with the rate of polynomial decay depending on the parameters of state- or time-dependent noises. Notably, if the state noise parameters fully approach their limiting value, the Lyapunov function achieves an exponential convergence rate. We provide two numerical examples to illustrate our theoretical findings in the context of stochastic gradient descent with Polyak-Ruppert averaging and stochastic bilevel optimization.
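As a rough illustration of one setting the abstract mentions, stochastic gradient descent with Polyak-Ruppert averaging, the sketch below runs noisy gradient steps on a toy quadratic and keeps a running average of the iterates. It is not the paper's two-time-scale scheme. The function name `sgd_polyak_ruppert`, the step-size exponent, the constant noise level and the objective are all assumptions chosen for illustration.

```python
import random


def sgd_polyak_ruppert(grad, x0, n_steps, step_exponent=0.75, noise_std=1.0, seed=0):
    """Run SGD with noisy gradients; return (last iterate, averaged iterate).

    All parameter names and defaults are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    x = x0
    avg = 0.0
    for k in range(1, n_steps + 1):
        step = k ** (-step_exponent)                   # slowly decaying step size
        noisy_grad = grad(x) + rng.gauss(0.0, noise_std)  # bounded-variance noise
        x -= step * noisy_grad
        avg += (x - avg) / k                           # running mean of x_1..x_k
    return x, avg


# Toy objective f(x) = 0.5 * (x - 3)^2, minimized at x = 3.
last, averaged = sgd_polyak_ruppert(lambda x: x - 3.0, x0=0.0, n_steps=20000)
```

The averaged iterate is typically much less noisy than the last iterate, which is the usual motivation for averaging. The noise here has constant variance, so the sketch does not reproduce the state- or time-dependent noise the paper analyzes.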

artificial intelligence, machine learning, optimization problem, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1609/aaai.v39i15.33756

2509.11039

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.54)

Exponential convergence rate for Iterative Markovian Fitting

Sokolov, Kirill, Korotin, Alexander

arXiv.org Artificial Intelligence, Aug-12-2025

Two distributions µ, ν ∈ P(X) with everywhere positive density are given. Recently the IMF algorithm [4] was proposed to solve problem (1); it consists of successive transformations interpreted as projections onto the sets of Markov and q-reciprocal processes (see [3, §2.5]). Here we prove, for the first time, exponential convergence of IMF. We rely on the convergence analysis of iterations [1] minimizing a strongly convex function with a Lipschitz gradient. We recall from [3, Theorem 3.1] that the solution p […]

The work was supported by the grant for research centers in the field of AI provided by the Ministry of Economic Development of the Russian Federation in accordance with the agreement 000000C313925P4F0002 and the agreement with Skoltech 139-10-2025-033.

artificial intelligence, exponential convergence rate, iterative markovian fitting, (9 more...)

arXiv.org Artificial Intelligence

2508.0277

Country:

Europe > Russia (0.34)
Asia > Russia (0.25)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence (1.00)

Reviews: Maximum Entropy Monte-Carlo Planning

Neural Information Processing Systems, Jan-25-2025, 02:42:33 GMT

This paper proposes a new MCTS algorithm, Maximum Entropy for Tree Search (MENTS), which combines the maximum entropy policy optimization framework with MCTS for more efficient online planning in sequential decision problems. The main idea is to replace the Monte Carlo value estimate with the softmax value estimate, as in the maximum entropy policy optimization framework, so that the state value can be estimated and back-propagated more efficiently in the search tree. Another main novelty is an optimal algorithm, Empirical Exponential Weight (E2W), used as the tree policy to encourage more exploration. The paper shows that MENTS can achieve an exponential convergence rate towards finding the optimal action at the root of the tree, which is much faster than the polynomial convergence rate of the UCT method. The experimental results also demonstrate that MENTS performs significantly better than UCT in terms of sample efficiency, in both synthetic problems and Atari games.
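To make the "softmax value estimate" mentioned in the review concrete, here is a minimal sketch of the standard log-sum-exp value and the matching Boltzmann policy over a node's action values. It is not the MENTS or E2W implementation. The temperature name `tau` and both function names are assumptions.

```python
import math


def softmax_value(q_values, tau):
    """Softmax (log-sum-exp) value: tau * log(sum_a exp(Q(a) / tau))."""
    m = max(q_values)  # shift by the max for numerical stability
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))


def softmax_policy(q_values, tau):
    """Boltzmann policy with probabilities proportional to exp(Q(a) / tau)."""
    m = max(q_values)
    weights = [math.exp((q - m) / tau) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


q = [1.0, 2.0, 3.0]
v_sharp = softmax_value(q, tau=0.01)   # approaches max(q) as tau shrinks
v_smooth = softmax_value(q, tau=1.0)   # exceeds max(q) for positive tau
pi = softmax_policy(q, tau=1.0)        # favors higher-valued actions
```

As the temperature shrinks the softmax value tends to the hard maximum, and larger temperatures spread probability mass across actions, which is one way to read the exploration behavior the review describes.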

algorithm, convergence rate, maximum entropy monte-carlo planning, (9 more...)

Neural Information Processing Systems

Industry: Leisure & Entertainment > Games > Computer Games (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Maximum Entropy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.88)

Global $\mathcal{L}^2$ minimization at uniform exponential rate via geometrically adapted gradient descent in Deep Learning

Chen, Thomas

arXiv.org Machine Learning, Dec-31-2023

We consider the gradient descent flow widely used for the minimization of the $\mathcal{L}^2$ cost function in Deep Learning networks, and introduce two modified versions: one adapted for the overparametrized setting, and the other for the underparametrized setting. Both have a clear and natural invariant geometric meaning, taking into account the pullback vector bundle structure in the overparametrized, and the pushforward vector bundle structure in the underparametrized setting. In the overparametrized case, we prove that, provided that a rank condition holds, all orbits of the modified gradient descent drive the $\mathcal{L}^2$ cost to its global minimum at a uniform exponential convergence rate; one thereby obtains an a priori stopping time for any prescribed proximity to the global minimum. We point out relations of the latter to sub-Riemannian geometry.
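The link between a uniform exponential rate and an a priori stopping time can be written out generically. The bound below is a sketch under assumptions, not the paper's statement: $\mathcal{C}(t)$ denotes the cost minus its global minimum value along an orbit, $c > 0$ is an assumed rate constant, and $\varepsilon$ is the prescribed proximity.

```latex
% Assumed exponential decay bound with a hypothetical rate constant c > 0:
\mathcal{C}(t) \le e^{-c t}\, \mathcal{C}(0)
\quad\Longrightarrow\quad
\mathcal{C}(t) \le \varepsilon
\quad \text{for all } t \ge \frac{1}{c} \log \frac{\mathcal{C}(0)}{\varepsilon} .
```

Because the rate is uniform over orbits, the stopping time depends only on the initial cost and the target accuracy, and grows only logarithmically in $1/\varepsilon$.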

geometrically adapted gradient descent, gradient flow, orbit, (12 more...)

arXiv.org Machine Learning

2311.15487

Country:

North America > United States > Texas > Travis County > Austin (0.14)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Elliptic PDE learning is provably data-efficient

Boullé, Nicolas, Halikias, Diana, Townsend, Alex

arXiv.org Artificial Intelligence, Sep-19-2023

PDE learning is an emerging field that combines physics and machine learning to recover unknown physical systems from experimental data. While deep learning models traditionally require copious amounts of training data, recent PDE learning techniques achieve spectacular results with limited data availability. Still, these results are empirical. Our work provides theoretical guarantees on the number of input-output training pairs required in PDE learning. Specifically, we exploit randomized numerical linear algebra and PDE theory to derive a provably data-efficient algorithm that recovers solution operators of 3D uniformly elliptic PDEs from input-output data and achieves an exponential convergence rate of the error with respect to the size of the training dataset with an exceptionally high probability of success.

algorithm, solution operator, training dataset, (10 more...)

arXiv.org Artificial Intelligence

doi: 10.1073/pnas.2303904120

2302.12888

Country:

North America > United States > New York > Tompkins County > Ithaca (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A Case of Exponential Convergence Rates for SVM

Cabannes, Vivien, Vigogna, Stefano

arXiv.org Artificial Intelligence, May-22-2023

Classification is often the first problem described in introductory machine learning classes. Generalization guarantees of classification have historically been offered by Vapnik-Chervonenkis theory. Yet those guarantees are based on intractable algorithms, which has led to the theory of surrogate methods in classification. Guarantees offered by surrogate methods are based on calibration inequalities, which have been shown to be highly sub-optimal under some margin conditions, falling short of capturing exponential convergence phenomena. Those "super" fast rates are becoming well understood for smooth surrogates, but the picture remains blurry for non-smooth losses such as the hinge loss, associated with the renowned support vector machines. In this paper, we present a simple mechanism to obtain fast convergence rates and we investigate its usage for SVM. In particular, we show that SVM can exhibit exponential convergence rates even without assuming the hard Tsybakov margin condition.

artificial intelligence, exponential convergence rate, machine learning, (1 more...)

arXiv.org Artificial Intelligence

2205.10055

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.53)

Disambiguation of weak supervision with exponential convergence rates

Cabannes, Vivien, Bach, Francis, Rudi, Alessandro

arXiv.org Artificial Intelligence, Feb-4-2021

In many applications of machine learning, such as recommender systems, where an input characterizing a user should be matched with a target representing an ordering of a large number of items, accessing fully supervised data is not an option. Instead, one should expect weak information on the target, such as a list of items previously taken (if items are online courses) or watched (if items are plays) by a user characterized by the feature vector. This motivates weakly supervised learning, which aims at learning a mapping from inputs to targets in a setting where tools from supervised learning cannot be applied off the shelf. Recent applications of weakly supervised learning showcase impressive results in solving complex tasks such as action retrieval on instructional videos (Miech et al., 2019), image semantic segmentation (Papandreou et al., 2015), salient object detection (Wang et al., 2017), 3D pose estimation (Dabral et al., 2018), and text-to-speech synthesis (Jia et al., 2018), to name a few. However, those applications of weakly supervised learning are usually based on clever heuristics, and theoretical foundations of learning from weakly supervised data are scarce, especially when compared to the statistical learning literature on supervised learning (Vapnik, 1995; Boucheron et al., 2005; Steinwart and Christmann, 2008). We aim to provide a step in this direction. In this paper, we focus on partial labelling, a popular instance of weak supervision, approached from a structured prediction point of view (Ciliberto et al., 2020). We detail this setup in Section 2. Our contributions are organized as follows.

algorithm, assumption, supervision, (13 more...)

arXiv.org Artificial Intelligence

2102.02789

Country:

Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (0.50)
Instructional Material > Course Syllabus & Notes (0.34)

Industry:

Education > Educational Technology (0.88)
Education > Educational Setting > Online (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.48)

Stochastic Gradient Descent with Exponential Convergence Rates of Expected Classification Errors

Nitanda, Atsushi, Suzuki, Taiji

arXiv.org Machine Learning, Jun-14-2018

We consider stochastic gradient descent for binary classification problems in a reproducing kernel Hilbert space. In traditional analysis, it is known that the expected classification error converges more slowly than the expected risk even when assuming a low-noise condition on the conditional label probabilities. Consequently, the resulting rate is sublinear. Therefore, it is important to consider whether much faster convergence of the expected classification error can be achieved. In recent research, an exponential convergence rate for stochastic gradient descent was shown under a strong low-noise condition, but theoretical analysis of this was limited to the square loss function, which is somewhat inadequate for binary classification tasks. In this paper, we show an exponential convergence rate of the expected classification error in the final phase of learning for a wide class of differentiable convex loss functions under similar assumptions.

artificial intelligence, machine learning, stochastic gradient descent, (16 more...)

arXiv.org Machine Learning

1806.05438

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
